
    Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems

    Modern deep learning-based recommendation systems exploit hundreds to thousands of different categorical features, each with millions of different categories ranging from clicks to posts. To respect the natural diversity within the categorical data, embeddings map each category to a unique dense representation within an embedded space. Since each categorical feature could take on as many as tens of millions of different possible categories, the embedding tables form the primary memory bottleneck during both training and inference. We propose a novel approach for reducing the embedding size in an end-to-end fashion by exploiting complementary partitions of the category set to produce a unique embedding vector for each category without explicit definition. By storing multiple smaller embedding tables based on each complementary partition and combining embeddings from each table, we define a unique embedding for each category at smaller memory cost. This approach may be interpreted as using a specific fixed codebook to ensure uniqueness of each category's representation. Our experimental results demonstrate the effectiveness of our approach over the hashing trick for reducing the size of the embedding tables in terms of model loss and accuracy, while retaining a similar reduction in the number of parameters. Comment: 11 pages, 7 figures, 1 table
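
The idea of combining several small tables can be illustrated with a minimal sketch. It assumes one particular pair of complementary partitions (the quotient and the remainder of the category index) and an element-wise product as the combining operation; the abstract does not fix either choice, and the names `make_tables` and `compositional_embedding` are hypothetical.

```python
import numpy as np


def make_tables(num_categories, num_buckets, dim, seed=0):
    """Create two small tables that together cover `num_categories` ids.

    Assumption: the partitions are "same quotient" and "same remainder"
    with respect to `num_buckets`. Tables are randomly initialised here;
    in a real model they would be trained.
    """
    rng = np.random.default_rng(seed)
    num_quotients = -(-num_categories // num_buckets)  # ceiling division
    q_table = rng.normal(size=(num_quotients, dim))
    r_table = rng.normal(size=(num_buckets, dim))
    return q_table, r_table


def compositional_embedding(index, q_table, r_table):
    """Combine one row from each table into a single embedding vector."""
    num_buckets = r_table.shape[0]
    q, r = divmod(index, num_buckets)
    # The pair (q, r) is distinct for every index, so each category
    # selects a distinct combination of rows.
    return q_table[q] * r_table[r]


q_table, r_table = make_tables(num_categories=1000, num_buckets=32, dim=8)
vector = compositional_embedding(123, q_table, r_table)
```

With 1000 categories and 32 buckets, the two tables hold 64 rows in total instead of 1000, while every category still maps to its own (quotient, remainder) pair. A plain hashing trick with 32 buckets would instead force many categories to share an identical row.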

    Methods for Quantized Compressed Sensing

    In this paper, we compare and catalog the performance of various greedy quantized compressed sensing algorithms that reconstruct sparse signals from quantized compressed measurements. We also introduce two new greedy approaches for reconstruction: Quantized Compressed Sampling Matching Pursuit (QCoSaMP) and Adaptive Outlier Pursuit for Quantized Iterative Hard Thresholding (AOP-QIHT). We compare the performance of greedy quantized compressed sensing algorithms for a given bit-depth, sparsity, and noise level.
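
To make the problem setting concrete, the following is a rough sketch of a generic baseline: plain iterative hard thresholding run on uniformly quantized measurements. It is not QCoSaMP or AOP-QIHT, whose details the abstract does not give. The uniform quantizer, the step size, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def uniform_quantize(values, step):
    """Round each measurement to the nearest multiple of `step`."""
    return step * np.round(values / step)


def hard_threshold(vec, sparsity):
    """Keep the `sparsity` largest-magnitude entries and zero the rest."""
    out = np.zeros_like(vec)
    keep = np.argsort(np.abs(vec))[-sparsity:]
    out[keep] = vec[keep]
    return out


def iht_from_quantized(A, y_quantized, sparsity, iters=200):
    """Greedy recovery that treats quantization error as ordinary noise."""
    # Assumed step size: inverse squared spectral norm of A, a
    # conservative choice for the gradient step.
    step_size = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        residual = y_quantized - A @ x
        x = hard_threshold(x + step_size * (A.T @ residual), sparsity)
    return x


rng = np.random.default_rng(0)
A = rng.normal(size=(60, 100)) / np.sqrt(60)
x_true = np.zeros(100)
x_true[[5, 40, 77]] = [1.5, -2.0, 1.0]
y_q = uniform_quantize(A @ x_true, step=0.25)
x_hat = iht_from_quantized(A, y_q, sparsity=3)
```

Bit-depth enters through `step` (a coarser step corresponds to fewer bits), and the sparsity level enters through `sparsity`; these are two of the quantities the comparison in the abstract varies.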

    Optimizing quantization for Lasso recovery

    This letter focuses on quantized Compressed Sensing, assuming that Lasso is used for signal estimation. Leveraging recent work, we provide a framework to optimize the quantization function and show that the recovered signal converges to the actual signal at a quadratic rate as a function of the quantization level. We show that when the number of observations is high, this method of quantization gives a significantly better recovery rate than standard Lloyd-Max quantization. We support our theoretical analysis with numerical simulations.

    A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale

    Shampoo is an online and stochastic optimization algorithm belonging to the AdaGrad family of methods for training neural networks. It constructs a block-diagonal preconditioner where each block consists of a coarse Kronecker product approximation to full-matrix AdaGrad for each parameter of the neural network. In this work, we provide a complete description of the algorithm as well as the performance optimizations that our implementation leverages to train deep networks at-scale in PyTorch. Our implementation enables fast multi-GPU distributed data-parallel training by distributing the memory and computation associated with blocks of each parameter via PyTorch's DTensor data structure and performing an AllGather primitive on the computed search directions at each iteration. This major performance enhancement enables us to achieve at most a 10% performance reduction in per-step wall-clock time compared against standard diagonal-scaling-based adaptive gradient methods. We validate our implementation by performing an ablation study on training ImageNet ResNet50, demonstrating Shampoo's superiority over standard training recipes with minimal hyperparameter tuning. Comment: 38 pages, 8 figures, 5 tables
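
The Kronecker-factored preconditioner can be sketched for a single matrix-shaped parameter. This is a minimal single-process illustration of the basic Shampoo update in NumPy, not the distributed PyTorch implementation described above: it omits blocking, DTensor, AllGather, momentum, and any learning-rate schedule. The class name, the `eps` regularizer, and the eigendecomposition-based root are assumptions.

```python
import numpy as np


def inverse_root(mat, power, eps=1e-12):
    """Return (mat + eps * I)^(-1/power) for a symmetric PSD matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.maximum(vals, 0.0) + eps  # guard against tiny negative values
    # Scale each eigenvector (column) by its transformed eigenvalue.
    return (vecs * vals ** (-1.0 / power)) @ vecs.T


class ShampooMatrixState:
    """Preconditioner state for one m-by-n parameter (hypothetical helper)."""

    def __init__(self, shape, eps=1e-12):
        m, n = shape
        self.L = np.zeros((m, m))  # accumulates G @ G.T (row statistics)
        self.R = np.zeros((n, n))  # accumulates G.T @ G (column statistics)
        self.eps = eps

    def direction(self, grad):
        """Update both factors and return the preconditioned direction."""
        self.L += grad @ grad.T
        self.R += grad.T @ grad
        left = inverse_root(self.L, 4, self.eps)
        right = inverse_root(self.R, 4, self.eps)
        return left @ grad @ right


state = ShampooMatrixState((4, 3))
grad = np.random.default_rng(0).normal(size=(4, 3))
step = state.direction(grad)  # would be scaled by a learning rate
```

Storing an m-by-m and an n-by-n factor costs m² + n² entries, compared with (mn)² for a full-matrix AdaGrad preconditioner over the same parameter, which is what makes the Kronecker approximation practical.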

    ESRRB regulates glucocorticoid gene expression in mice and patients with acute lymphoblastic leukemia

    Synthetic glucocorticoids (GCs), such as dexamethasone and prednisone, remain key components of therapy for patients with lymphoid malignancies. For pediatric patients with acute lymphoblastic leukemia (ALL), response to GCs remains the most reliable prognostic indicator; failure to respond to GCs correlates with poor event-free survival. To uncover GC resistance mechanisms, we performed a genome-wide, survival-based short hairpin RNA screen and identified the orphan nuclear receptor estrogen-related receptor-beta (ESRRB) as a critical transcription factor that cooperates with the GC receptor (GR) to mediate the GC gene expression signature in mouse and human ALL cells. Esrrb knockdown interfered with the expression of genes that were induced and repressed by GR and resulted in GC resistance in vitro and in vivo. Dexamethasone treatment stimulated ESRRB binding to estrogen-related receptor elements (ERREs) in canonical GC-regulated genes, and H3K27Ac Hi-chromatin immunoprecipitation revealed increased interactions between GR- and ERRE-containing regulatory regions in dexamethasone-treated human T-ALL cells. Furthermore, ESRRB agonists enhanced GC target gene expression and synergized with dexamethasone to induce leukemic cell death, indicating that ESRRB agonists may overcome GC resistance in ALL, and potentially, in other lymphoid malignancies.

    The New Generation Atlas of Quasar Spectral Energy Distributions from Radio to X-rays

    We have produced the next generation of quasar spectral energy distributions (SEDs), essentially updating the work of Elvis et al. (1994) by using high-quality data obtained with several space and ground-based telescopes, including NASA's Great Observatories. We present an atlas of SEDs of 85 optically bright, non-blazar quasars over the electromagnetic spectrum from radio to X-rays. The heterogeneous sample includes 27 radio-quiet and 58 radio-loud quasars. Most objects have quasi-simultaneous ultraviolet-optical spectroscopic data, supplemented with some far-ultraviolet spectra, and more than half also have Spitzer mid-infrared IRS spectra. The X-ray spectral parameters are collected from the literature where available. The radio, far-infrared, and near-infrared photometric data are also obtained from either the literature or new observations. We construct composite spectral energy distributions for radio-loud and radio-quiet objects and compare these to those of Elvis et al., finding that ours have similar overall shapes, but our improved spectral resolution reveals more detailed features, especially in the mid and near-infrared. Comment: 46 pages, 10 figures, 10 tables, Accepted by ApJS. Composite SED data files for radio-loud and radio-quiet quasars (rlmsedMR.txt, rqmsedMR.txt) are included in the source (Other formats -> Source). Supplemental figures are not included

    Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species

    Background: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.